Gossip learning with linear models on fully distributed data
نویسندگان
چکیده
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model we have one data record at each network node, but without the possibility to move raw data due to privacy considerations. For example, user profiles, ratings, history, or sensor readings can represent this case. This problem is difficult, because there is no possibility to learn local models, the system model offers almost no guarantees for reliability, yet the communication cost needs to be kept low. Here we propose gossip learning, a generic approach that is based on multiple models taking random walks over the network in parallel, while applying an online learning algorithm to improve themselves, and getting combined via ensemble learning methods. We present an instantiation of this approach for the case of classification with linear models. Our main contribution is an ensemble learning method which—through the continuous combination of the models in the network—implements a virtual weighted voting mechanism over an exponential number of models at practically no extra cost as compared to independent random walks. We prove the convergence of the method theoretically, and perform extensive experiments on benchmark datasets. Our experimental analysis demonstrates the performance and robustness of the proposed approach.
منابع مشابه
Efficient P2P Ensemble Learning with Linear Models on Fully Distributed Data
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model we have one data record at each network node, but without the possibility to move raw data due to privacy considerations. For example, user profiles, ratings, history, or sensor readings can represent this case. This problem is difficult, because there is no possibility to l...
متن کاملDisTriB: Distributed Trust Management Model Based on Gossip Learning and Bayesian Networks in Collaborative Computing Systems
The interactions among peers in Peer-to-Peer systems as a distributed collaborative system are based on asynchronous and unreliable communications. Trust is an essential and facilitating component in these interactions specially in such uncertain environments. Various attacks are possible due to large-scale nature and openness of these systems that affects the trust. Peers has not enough inform...
متن کاملDisTriB: Distributed Trust Management Model Based on Gossip Learning and Bayesian Networks in Collaborative Computing Systems
The interactions among peers in Peer-to-Peer systems as a distributed collaborative system are based on asynchronous and unreliable communications. Trust is an essential and facilitating component in these interactions specially in such uncertain environments. Various attacks are possible due to large-scale nature and openness of these systems that affects the trust. Peers has not enough inform...
متن کاملDifferentially Private Linear Models for Gossip Learning through Data Perturbation
Privacy is a key concern in many distributed systems that are rich in personal data such as networks of smart meters or smartphones. Decentralizing the processing of personal data in such systems is a promising first step towards achieving privacy through avoiding the collection of data altogether. However, decentralization in itself is not enough: Additional guarantees such as differential pri...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Concurrency and Computation: Practice and Experience
دوره 25 شماره
صفحات -
تاریخ انتشار 2013